Despite immense variability across languages, people can learn to 
understand any human language, spoken or signed. What neural 
mechanisms allow people to comprehend language across sensory 
modalities? When people listen to speech, electrophysiological os- 
cillations in auditory cortex entrain to slow (<8 Hz) fluctuations in the 
acoustic envelope. Entrainment to the speech envelope may reflect 
mechanisms specialized for auditory perception. Alternatively, flexi- 
ble entrainment may be a general-purpose cortical mechanism that 
optimizes sensitivity to rhythmic information regardless of modality. 
Here we test these proposals by examining cortical coherence to vi- 
sual information in sign language. First, we develop a metric to quan- 
tify visual change over time. We find quasi-periodic fluctuations in 
sign language, characterized by lower frequencies than fluctuations 
in speech. Next, we test for entrainment of neural oscillations to vi- 
sual change in sign language, using electroencephalography (EEG) 
in fluent speakers of American Sign Language (ASL) as they watch 
videos in ASL. We find significant cortical entrainment to visual os- 
cillations in sign language below 5 Hz, peaking at about 1 Hz. Coher- 
ence to sign is strongest over occipital and parietal cortex, in con- 
trast to speech, where coherence is strongest over the auditory cor- 
tex. Non-signers also show coherence to sign language, but entrain- 
ment at frontal sites is reduced relative to fluent signers. These re- 
sults demonstrate that flexible cortical entrainment to language does 
not depend on neural processes that are specific to auditory speech 
perception. Low-frequency oscillatory entrainment may reflect a gen- 
eral cortical mechanism that maximizes sensitivity to informational 
peaks in time-varying signals. 
Languages differ dramatically from one another, yet people 
can learn to understand any natural language. What 
neural mechanisms allow humans to understand the vast di- 
versity of languages, and to distinguish linguistic signal from 
noise? One mechanism that has been implicated in language 
comprehension is neural entrainment to the volume envelope 
of speech. The volume envelope of speech fluctuates at low 
frequencies (< 8 Hz), decreasing at boundaries between sylla- 
bles, words, and phrases. When people listen to speech, neural 
oscillations in the delta (1-4 Hz) and theta bands (4-8 Hz) 
become entrained to these fluctuations in volume [1-4]. 

Entrainment to the volume envelope may represent an 
active neural mechanism to boost perceptual sensitivity to 
rhythmic stimuli [2, 5-7]. Although entrainment is partly 
driven by bottom-up features of the stimulus [8-10], it also 
depends on top-down signals to auditory cortex from other 
brain areas. Auditory entrainment is strengthened when people 
see congruent visual and auditory information [11, 12], and 
is modulated by attention [13] and by top-down signals from 
frontal cortex [4, 14]. 

Cortical entrainment is proposed to perform a key role in 
speech comprehension, such as segmenting out syllables from 
a continuous speech stream [1, 2, 15] or optimizing perceptual 
sensitivity to rhythmic pulses of sound [5-7]. However, the 
mechanisms driving entrainment to speech remain unclear. 
We consider two hypotheses. First, flexible entrainment to 
quasi-periodic rhythms may be specific to auditory percep- 
tion [6]; in visual perception, by contrast, cortical oscillations 
in the alpha band (8-12 Hz) may phase-lock only to consis- 
tent stimulus rhythms, without adjusting to variable stimulus 
rhythms [16]. Second, low-frequency cortical entrainment may 
be a general-purpose neural mechanism that helps optimize 
perception to time-varying stimuli regardless of the perceptual 
modality. Neural oscillations may allow the brain to rhythmi- 
cally orient attention to quasi-periodic stimuli [5, 17] across 
sensory systems. 


Because previous studies of cortical entrainment to rhythms 
in language have focused on oral speech, they have been un- 
able to distinguish between these competing hypotheses. Here 
we test for low-frequency entrainment to a purely visual lan- 
guage: American Sign Language (ASL). Prior studies show 
that neural and behavioral oscillations in vision are prefer- 
entially entrained by stimuli that flicker in the alpha band 
[18-21]. Therefore, if flexible cortical entrainment to oral 
speech depends on modality-specific properties of auditory 
processing, then phase-locking to sign language should be 
concentrated in the alpha band, if it occurs at all [16]. Alter- 
natively, if cortical entrainment is a generalized neural strategy 
to maximize sensitivity to rhythmic stimuli, then oscillatory 
activity in visual cortex should entrain at the frequency of 
informational changes in ASL. 


To determine whether human cerebral cortex entrains to 
rhythmic information in sign language, first we developed 
a metric for quantifying the amplitude of visual change in 
sign, by analogy to the acoustic envelope of speech. Next, 
we characterized visual variability across four sign languages, 
showing that this variability is quasi-periodic below 8 Hz. 
Finally, we demonstrated that cerebral cortex entrains to visual 
variability in sign language, and showed that entrainment is 
strongest around the frequencies of phrases and individual 
signs in ASL. 

Fig. 1. Calculation of the Instantaneous Visual Change (IVC). The IVC summarizes 
total visual change at each point in time. First, the difference between adjacent 
grayscale video frames (top row) is calculated for each pixel. To aggregate over both 
increases and decreases in brightness, these pixel-wise differences are then squared 
(middle row). Finally, the brightness values in all pixels of the squared-difference 
images are summed to obtain a single value summarizing the magnitude of change 
between two video frames. Computation of this value for each adjacent pair of frames 
yields a time-series (bottom). 


Results 


Developing a metric for quantifying visual change. In order 
to examine neural entrainment to visual rhythms in sign lan- 
guage, we must first quantify the ‘amplitude envelope’ of a 
visual signal. The acoustic envelope is a highly reduced repre- 
sentation of sound, tracing extreme amplitude values in the 
time-varying signal. Oscillations in the envelope of speech de- 
pend on movements of various components of the vocal tract, 
including the rhythmic opening and closing of the mandible 
[22]. Sign language, in contrast, does not involve consistent 
oscillatory movements by any single effector [23]. However, 
quasi-periodic oscillations in sign language may arise from the 
coordinated movements of multiple effectors. 

Here we present the Instantaneous Visual Change (IVC) 
as a metric that is conceptually similar to the acoustic enve- 
lope, summarizing the amplitude of change at each time point. 
The IVC is a time-series of aggregated visual changes between 
frames (Fig. 1, Materials and Methods). This algorithm 
provides an automatic, objective alternative to human-coded 
methods of studying temporal structure in sign [24]. 

The amplitude of the IVC indexes the amount of visual 
change between two video frames. The largest peaks in the 
IVC therefore occur during large, quick movements. In the 
videos we analyzed, these changes corresponded primarily to 
movements of the signers’ hands and arms, but may also reflect 
movements of the face, head, and torso. For example, a quick 
arm movement results in a larger number of pixels changing 
in each adjacent frame — and a higher peak in the IVC — than 
a slow arm movement. The IVC thus offers a heuristic index 
of new linguistic information in the visual signal. An example 
video illustrating the IVC is included in the online Supporting 
Information (Video S1). 


Characterizing temporal structure in sign language. The IVC 
allows us to characterize one dimension of the temporal struc- 
ture in sign language, and to directly compare the spectral sig- 
natures of amplitude variability across sign and oral speech. Vi- 
sual examination of the raw IVC of sign language reveals quasi- 
periodic oscillations with irregularly-timed peaks (Fig. 2A). 
To characterize variability within and across sign languages, 
we computed the power spectra of the IVC from samples of 
four different sign languages: American Sign Language (ASL), 
German Sign Language (Deutsche Gebärdensprache, DGS), 
British Sign Language (BSL), and Australian Sign Language 
(Auslan). These languages developed independently of the 
oral languages spoken in these countries, and come from three 
genetically unrelated language families (BSL and Auslan from 
the BANZSL family, ASL from the French Sign Language family, 
DGS from the German Sign Language family). In all four 
languages, power in the IVC decreases monotonically with 
increasing frequency, without any pronounced peaks in the 
spectrum (Fig. 2C). We tested for rhythmic components in 
the IVC by comparing these spectra against the 1/f spectrum 
characteristic of many signals in the natural world. Power 
that is higher than the 1/f function indicates periodicity at 
that frequency [22]. The IVC of sign language showed elevated 
power at approximately 2-8 Hz (Ps < .01). Individual signs 
in sign languages tend to occur at approximately 2-2.5 Hz 
[25, 26], on the lower end of the rhythmic components in the 
IVC. These analyses suggest that sign language involves weak, 
quasi-periodic rhythms with variable frequencies in the delta 
and theta range. 

To explicitly compare temporal structure between sign and 
speech, we contrasted the IVC of sign with the broadband 
envelope of speech [22]. We computed the broadband envelope 
of samples from nine spoken languages representing five lan- 
guage families: English, French, Portuguese, Dutch, German, 
Hungarian, Japanese, Arabic, and Mandarin. After resam- 
pling the broadband envelopes and IVC signals to a common 
frequency (30 Hz) and standardizing the amplitude of each 
recording by dividing out its standard deviation, we compared 
the average spectra of the IVC of sign and the broadband en- 
velope of speech (Fig. 2D). Spoken languages showed stronger 
modulations than sign languages above 2 Hz. This increased 
power may reflect modulation due to syllables in speech, which 
occur at approximately 2-10 Hz [22, 27]. Indeed, peaks from 
individual syllables are visible in the broadband envelope, and 
these peaks occur at about 4 Hz (Fig. 2B). These results in- 
dicate that visual motion in sign language is modulated at 
lower frequencies than auditory volume in spoken language. 
This difference is consistent with the slower movements in the 
articulators for sign (the hands) than in the articulators for 
speech (the vocal tract) [25, 26]. 


Cortical coherence to visual rhythms in sign language. We 
used electroencephalography (EEG) to examine cortical en- 
trainment to quasi-rhythmic fluctuations in visual information 
in sign language. Fluent speakers of ASL watched videos of 
ASL stories against a static background. We tested for co- 
herence of low-frequency electrophysiological oscillations to 
quasi-periodic oscillations in the IVC. 

Fig. 2. Temporal structure in signed and spoken language. (A) Example trace of the 
IVC of ASL. (B) Example trace of the broadband envelope of English. (C) Spectrum 
of sign language plotted on log-log axes. Line color denotes language (BSL: British 
Sign Language; DGS: German Sign Language; Auslan: Australian Sign Language; 
ASL: American Sign Language). Each curve shows the spectrum of a separate video 
sample (N = 14, total duration 1:10:22). The black line shows the best-fit 1/f 
trend across all samples. The gray bar at the base of the plot shows where the IVC 
spectra are significantly greater than the 1/f fit (P < .01 by 1-sample t-tests). (D) 
Comparison of the mean spectra across signed and spoken languages. Shaded area 
depicts the standard error of the mean. The gray bar indicates significant differences 
between the two curves (P < .01 by independent-samples t-tests). Sign language 
samples are the same as in panel (C). Audio recordings were sampled from speech 
in 9 languages (N = 12, total duration 1:07:28). Amplitude in all analyses has been 
standardized by each recording’s standard deviation. 


Coherence was calculated separately at each EEG channel 
in partially overlapping, logarithmically spaced bins centered 
over 0.5-16 Hz (Materials and Methods). Because coherence 
provides no intrinsic measure of chance performance, we cre- 
ated a null distribution of coherence using a randomization 
procedure. To obtain each value in the null distribution, we 
time-shifted the IVC to a randomly-selected starting point, 
moving the portion of the IVC that remained after the final 
time point of the EEG signal to the beginning of the recording. 
This procedure preserves the spectral and temporal charac- 
teristics of the EEG and IVC recordings, but eliminates any 
relationship between these signals. Coherence was then com- 
puted between the EEG recordings and the randomly-shifted 
IVC. 
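The time-shifting procedure described above amounts to a circular shift of the IVC. The following is a minimal sketch in Python, assuming the IVC and EEG are already sampled on a common time base. The function name `circular_shift_null`, the statistic passed in as `stat_fn`, and the simple rank-based P value are illustrative assumptions; they stand in for coherence and for the cluster-based permutation test used in the study.

```python
import numpy as np

def circular_shift_null(ivc, eeg, stat_fn, n_perm=1000, seed=0):
    """Null distribution of stat_fn(shifted_ivc, eeg) under random circular shifts.

    Hypothetical helper: np.roll moves the samples that run past the end of
    the recording back to the beginning, so each signal keeps its own
    spectrum while the alignment between the two signals is destroyed.
    """
    rng = np.random.default_rng(seed)
    ivc = np.asarray(ivc)
    null = np.empty(n_perm)
    for k in range(n_perm):
        shift = rng.integers(1, len(ivc))  # random starting point, never 0
        null[k] = stat_fn(np.roll(ivc, shift), eeg)
    return null

def abs_corr(a, b):
    """Toy statistic (assumption): absolute Pearson correlation."""
    return abs(np.corrcoef(a, b)[0, 1])

# Toy data: the "EEG" is an exact copy of the "IVC", so the observed
# statistic should exceed every value in the shifted null distribution.
rng = np.random.default_rng(1)
ivc = rng.standard_normal(500)
eeg = ivc.copy()
observed = abs_corr(ivc, eeg)
null = circular_shift_null(ivc, eeg, abs_corr, n_perm=200)
p_value = (np.sum(null >= observed) + 1) / (len(null) + 1)
```

Because the shift is circular, no samples are discarded, which is what allows the null distribution to preserve the spectral content of both recordings.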

A cluster-based permutation test indicated that coherence 
between cortical oscillations and the IVC of sign was stronger 
than would be expected by chance (P = .0001). Averaging 
the coherence spectrum across every EEG channel, coherence 
was above chance from 0.4-5 Hz, peaking at 1 Hz (Fig. 3A). 
Coherence emerged over a similar range of frequencies when 
we selected only occipital channels (0.8-5 Hz; P = .0001), 
primarily reflecting entrainment in visual cortex (Fig. 3B). In 
frontal channels, above-chance coherence was present from 
0.4-1.25 Hz (P = .0001; Fig. 3C), revealing top-down control 
from frontal cortex. Examining the entire scalp distribution, 
cortical coherence to the IVC of sign language was strongest 
over central and occipital channels (Fig. 3D). 

To test whether cortical entrainment depends on linguistic 
knowledge, we examined coherence to sign language in people 
who did not know any ASL. Like signers, non-signers showed 
significant coherence to videos of ASL storytelling (P < .0005), 
with the strongest coherence over central and occipital channels 
from 0.8-3.5 Hz (Fig. S1). We then separately analyzed effects 
of linguistic knowledge on entrainment in occipital and frontal 
cortex. Although coherence at occipital channels did not 
significantly differ between groups (Fig. 4B), coherence at 
frontal channels was stronger in signers than in non-signers, 
indicating differences in top-down control based on familiarity 
with ASL (P < .05; Fig. 4A; Fig. S1). 


Discussion 


Cortical coherence to sign language. In this study, we find 
that electrophysiological oscillations in human cerebral cortex 
become entrained to quasi-periodic fluctuations of visual move- 
ment in sign language. In fluent signers, cortical entrainment 
to sign language was found between 0.4 and 5 Hz, peaking 
at about 1 Hz, and emerged most robustly over occipital and 
central EEG channels. These results show that the human 
brain entrains to low-frequency variability in language whether 
it is perceived with the ears or eyes. Visual cortex flexibly 
phase-locks to visible changes in sign language, analogously 
to the way auditory cortex phase-locks to amplitude changes 
in oral speech. Our findings argue that flexible entrainment 
depends on mechanisms that are not specific to any given 
effector or sensory modality. 

Prior results suggest that auditory and visual perception 
are differentially modulated by rhythms at different frequencies 
[16]. Auditory sensitivity varies as a function of the power 
and phase of spontaneous 2-6 Hz rhythms [28], and these 
oscillations are entrained by sounds modulated at 3 Hz [29, 30]. 
Visual sensitivity, by contrast, depends on the power and phase 
of spontaneous alpha rhythms [31-33], and electrophysiological 
oscillations in visual cortex are robustly entrained by periodic 
stimulation around 10 Hz [19, 21]. When humans watch a 
light flicker at frequencies from 1-100 Hz, visual cortex shows 
the strongest entrainment around 10 Hz [18]. 

Although these rhythmic preferences are not absolute (vi- 
sual cortex also shows rhythmic oscillations in the delta/theta 
range [11, 17, 34-37]), differences between the sensory modali- 
ties are apparent when different frequency bands are directly 
compared. Auditory detection sensitivity depends on the phase 
of underlying delta-theta but not alpha oscillations [28]. In 
response to aperiodic stimulation, visual cortex oscillates in 
the alpha band [20], whereas auditory cortex does not show 
consistent oscillatory activity [38]. 

If sensory-specific oscillatory preferences determined the 
spectrum of entrainment, then peak coherence to sign language 
should be observed at the frequencies preferred by visual cortex: 
around 10 Hz. Contrary to this prediction, we find that cortical 
coherence to sign language only emerges below 5 Hz. Cerebral 
cortex entrains to sign language around the frequencies of 
words and phrases in ASL. 


Coherence across signers and non-signers. We find that cere- 
bral cortex phase-locks to visual changes in ASL both in fluent 
signers and in people with no knowledge of sign language. In 
principle, coherence to ASL in non-signers could emerge for 
two reasons. Coherence could be driven either bottom-up, by 
sensory stimulation, or top-down, by non-linguistic temporal 
predictions. Human bodies move in predictable ways, and 
non-signers could entrain to sign language based on these 
regularities in human movement. 

Fig. 3. Coherence between cortex and the IVC of sign language in fluent signers (N = 13). (A) Coherence spectrum averaged across all EEG channels. For each participant, 
we computed the difference between empirical coherence and a distribution of cortical coherence to randomly shifted IVC. The solid line shows the mean difference in empirical and 
randomized coherence across participants, and the shaded area shows the 95% CI around the mean. The dotted line shows chance levels. (B) Coherence spectrum averaged 
over occipital channels. Inset shows the location of selected channels. (C) Coherence spectrum averaged over frontal channels. Inset shows the location of selected channels. 
(D) Scalp topography of coherence in each frequency bin, averaged across participants. 

Fig. 4. Comparison of coherence in fluent signers (N = 13) and non-signers 
(N = 15). (A) Coherence at frontal channels (channel selection illustrated at right) 
was stronger in signers than in non-signers. (B) Coherence at occipital channels did 
not differ between groups. Data from signers are the same as in Fig. 3. *, P < .05; 
n.s., not statistically significant. 


In frontal areas, fluent signers showed stronger coherence 
than non-signers. This difference in frontal coherence may 
reflect top-down sensory predictions based on knowledge of 
ASL. Alternatively, differences in coherence could reflect differ- 
ences in attention to the videos. Cortical entrainment to oral 
speech decreases when people direct attention away from the 
speech stimulus [13, 39]. Reduced coherence in non-signers, 
therefore, would also be predicted if non-signers do not attend 
to videos of ASL as strongly as fluent signers do. However, 
our findings at occipital channels argue against this possibility. 
If differences between groups were driven by attention, then 
occipital coherence should be stronger in signers than in non- 
signers. However, we find no evidence that occipital coherence 
depends on linguistic knowledge. Taken together, these results 
suggest that although linguistic knowledge is not necessary for 
entrainment, signers may leverage knowledge about ASL to 
sharpen temporal predictions during language comprehension. 
These sharpened predictions result in stronger entrainment 
in the frontal regions that exert top-down control over visual 
perception. 


Specialization for speech? Syllables in oral speech occur at 
frequencies that largely overlap with cortical entrainment to 
the volume envelope. This overlap could be interpreted as 
evidence for a specialized oscillatory mechanism for speech 
comprehension. This type of speech-specific mechanism could 
evolve in at least two ways. First, “the articulatory motor sys- 
tem [may have] structured its output to match those rhythms 
the auditory system can best apprehend” [6]. Second, audi- 
tory mechanisms may have developed to comprehend speech 
based on the timing of preexisting oral behaviors. Non-human 
primates create vocalizations and facial displays that fluctuate 
at frequencies similar to human speech syllables [40], and their 
attention is preferentially captured by faces that move at these 
frequencies [41]; perhaps auditory processing evolved to fit the 
timing profile of these behaviors. 

The data we report here, however, suggest that entrain- 
ment may not have any close evolutionary link to oral speech. 
Instead, a more general process may underlie cortical phase- 
locking to variability in language. Previous results are consis- 
tent with this interpretation as well. When participants watch 
videos of speech, entrainment emerges not only in auditory 
cortex, but also in visual cortex [11, 35, 36]. Furthermore, 
cortical rhythms entrain to rhythms in music [42] and to other 
rhythmic stimuli in audition [29, 30] and vision [19, 43]. These 
examples of low-frequency cortical entrainment to a broad 
range of stimuli across sensory modalities suggest that the 
cortical mechanisms supporting entrainment to the volume 
envelope of speech may be a specialized case of a general 
predictive process. 


Neural mechanisms of language comprehension across sen- 
sory modalities. Previous studies have shown that the func- 
tional neuroanatomy of speech largely overlaps with that of 
sign [44]. At the coarsest level of anatomical specificity, the 
left hemisphere is specialized for spoken language. The left 
hemisphere is also asymmetrically active during sign language 
perception [45] and production, regardless of which hand peo- 
ple use to sign [46, 47]. Left hemisphere damage, furthermore, 
results in linguistic deficits in signing patients [48]. 

Specific regions within the left hemisphere show similar 
involvement in processing both speech and sign. Across signed 
and spoken language, blood flow increases to the left inferior frontal gyrus (LIFG) and left 
inferior parietal lobe (IPL) during phonemic discrimination 
[49, 50] and morphosyntactic processing [51]. Similarly, word 
production in both signed and spoken languages activates 
LIFG, left IPL, and left temporal areas [52]. 

Differences in the cortical areas involved in sign and speech 
can often be attributed to differences in the form of these 
languages. For example, comprehension of sign language ac- 
tivates primary visual but not primary auditory cortex [45]. 
Consistent with the fact that sign language relies on spatial 
contrasts, inferior and superior parietal cortex is more strongly 
active during signed than during spoken language production 
[52] and perception [49]. 

Our findings go beyond functional neuroanatomy to ex- 
amine neurophysiological processes that can arise in multi- 
ple cortical areas. We show that oscillatory entrainment to 
low-frequency variability in the stimulus occurs regardless of 
whether language is being processed using auditory cortex or 
visual cortex. 

Our results differ from previous studies on entrainment 
to speech primarily in the scalp topography of coherence. 
Entrainment to auditory speech is strongest over auditory 
cortex [2, 8, 11, 35] and central frontal sites [14]. By contrast, 
our results show that entrainment to sign language is strongest 
at occipital and parietal channels, consistent with greater 
parietal activation during sign compared with speech [53]. This 
difference likely reflects increased visual and spatial demands 
of perceiving sign language. 


The IVC quantifies temporal structure in visual perception. 
The Instantaneous Visual Change (IVC) provides a novel 
method for examining gross temporal structure in natural 
visual stimuli. Analogously to the way the broadband enve- 
lope summarizes early stages of auditory processing, the IVC 
provides a first approximation of the magnitude of information 
available to the earliest stages of visual processing. At the 
first stage of auditory transduction, hair cells in the cochlea 
extract the narrowband envelope of sounds. Summing these 
narrowband envelopes together yields the overall auditory re- 
sponses over time: the broadband envelope. In the retina, 
center-surround retinal ganglion cells respond to changes in 
the brightness of specific wavelengths of light. Summing the 
responses from these cells yields the overall visual responses 
over time, approximated by the IVC. 

The IVC provides a coarse index of visual information 
in sign language, just as the broadband envelope provides a 
coarse index of information in speech. For example, the volume 
envelope does not reflect small spectral differences that are 
crucial for discriminating vowels. The IVC, analogously, does 
not preserve information about which effectors are moving or 
their trajectories. Nevertheless, sign language comprehenders 
may use the IVC of sign heuristically, as listeners use the 
acoustic envelope of speech, to anticipate when important 
information is likely to appear. 

In the current study, we use the IVC to characterize tem- 
poral structure in sign language, and to examine responses of 
the human brain to that temporal structure. The IVC could 
also be applied to study temporal structure in other domains, 
such as gesture, biological motion, and movement in natural 
scenes. 


The functional role of entrainment to language. Oscillatory en- 
trainment to language may be a specific case of a general cor- 
tical mechanism. In primates, spiking probability varies with 
the phase of low-frequency oscillations: neurons are most likely 
to fire at specific points in the phase of ongoing oscillations 
[54, 55]. Perhaps the cortex strategically resets the phase of 
ongoing neural oscillations to ensure that perceptual neurons 
are in an excitable state when new information is likely to 
appear [5-7, 14]. Oscillatory entrainment may constitute a 
cortical strategy to boost perceptual sensitivity at informa- 
tional peaks in language. Our findings suggest that the brain 
can flexibly entrain to linguistic information regardless of the 
modality in which language is produced or perceived. 


Materials and Methods 


Participants watched approximately 20 minutes of naturalistic 
storytelling in American Sign Language (ASL) while EEG was 
recorded. Participants were instructed to watch the videos and 
remain still and relaxed. All procedures were approved by the 
Institutional Review Board of the University of Chicago. Detailed 
methods and analyses are available in the online Supporting Text. 


Participants. Participants had corrected-to-normal vision and re- 
ported no history of epilepsy, brain surgery, or traumatic brain 
injuries. Informed consent was obtained before beginning the exper- 
iment, and participants were paid $20/hour for their participation. 
We recorded EEG from 16 fluent signers of American Sign Language 
(ASL). Data from 2 participants were excluded before analysis due 
to excessive EEG artifacts, and data from 1 participant were lost 
due to experimenter error. All participants retained in the analyses 
began learning ASL by 5 years of age (N = 13; 3 female, 10 male; 
age 24-44; mean age of acquisition 1.1 years). Participants who 
used hearing aids or cochlear implants removed the devices before 
beginning the experiment. A fluent speaker of ASL (J.L.) answered 
participants’ questions about the study. We recorded EEG from 
an additional 16 participants who had no prior exposure to ASL. 
These participants were recruited from the University of Chicago 
community through online postings. One participant who was cur- 
rently learning ASL was excluded before analyses, leaving N = 15 
non-signing participants (10 female, 5 male; age 18-31). 


Instantaneous Visual Change (IVC). The IVC represents a time-series 
of aggregated visual change between frames (Fig. 1), and is com- 
puted as the sum of squared differences in each pixel across sequen- 
tial frames of video: 


$$\mathrm{IVC}(t) = \sum_{i} \left[ x_i(t) - x_i(t-1) \right]^2$$


where $x_i(t)$ is the grayscale value of pixel $i$ at time $t$. Python code for 
the IVC is available at http://casasanto.com/geoffbrookshire/ivc.py. 
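
As a rough illustration of the definition above (not the released ivc.py, which we do not reproduce here), the IVC can be sketched in a few lines of NumPy. The function name `instantaneous_visual_change` and the assumed input layout (an array of grayscale frames with shape frames x height x width) are hypothetical choices made for this example.

```python
import numpy as np

def instantaneous_visual_change(frames):
    """Sum of squared pixel-wise differences between adjacent frames.

    frames: grayscale video as an array of shape (n_frames, height, width).
    Returns a time-series of length n_frames - 1.
    """
    # Cast to float first so that unsigned 8-bit pixel values cannot
    # wrap around when a pixel gets darker.
    frames = np.asarray(frames, dtype=np.float64)
    diffs = np.diff(frames, axis=0)          # x_i(t) - x_i(t - 1)
    return np.sum(diffs ** 2, axis=(1, 2))   # sum over all pixels i

# Toy video with 3 frames of 2 x 2 pixels.
frames = np.zeros((3, 2, 2))
frames[1, 0, 0] = 2.0     # one pixel brightens by 2, so the change is 2**2 = 4
frames[2] = frames[1]     # nothing moves between the last two frames
ivc = instantaneous_visual_change(frames)
```

Squaring the differences means that brightening and darkening pixels both add to the total, as described in Fig. 1.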

EEG analysis. See Supporting Text for details about EEG acquisition 
and preprocessing. To compute coherence, IVC and EEG data were 
filtered into overlapping log-spaced frequency bins using phase- 
preserving forward-reverse Butterworth bandpass filters. Bins were 
centered on values from 0.5-16 Hz, and included frequencies in 
the range $(0.8f, 1.25f)$, where $f$ is the center frequency $f = 2^n$ 
for $n \in \{-1, -0.5, 0, \ldots, 4\}$. Instantaneous phase and power were 
determined with the Hilbert transform. Power was computed as 
the absolute value of the analytic signal, and phase as the angle of 
the analytic signal. These power and phase estimates were then 
used to calculate coherence: 


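$$\mathrm{Coh} = \frac{\left| \sum_t e^{i\theta_t} \sqrt{P_{V,t} \cdot P_{E,t}} \right|}{\sqrt{\sum_t \left( P_{V,t} \cdot P_{E,t} \right)}}$$


where $t$ is the time point, $\theta$ is the phase difference between the IVC 
and EEG, $P_V$ is power in the IVC, and $P_E$ is power in the EEG 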
recording [9]. Statistical significance of coherence was determined by 
a two-stage randomization procedure. First, the IVC was randomly 
shifted to obtain a null distribution of coherence between the two 
signals. Second, statistical significance was determined using cluster- 
based permutation tests [56] (Supporting Text). 
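
The filtering and coherence steps above can be sketched as follows. This is a hedged illustration rather than the analysis code used in the study: the filter order, the use of second-order sections, the function names, and the assumption that the IVC and EEG share one sampling rate are all ours. The normalization follows one reading of the coherence definition above, in which the weighted phase sum is divided by the square root of the summed product of the two power series.

```python
import numpy as np
from scipy import signal

# Center frequencies f = 2**n for n in {-1, -0.5, 0, ..., 4}, i.e. 0.5-16 Hz.
centers = 2.0 ** np.arange(-1, 4.5, 0.5)

def band_analytic(x, fs, f_center, order=2):
    """Zero-phase Butterworth band-pass over (0.8 f, 1.25 f), then Hilbert.

    The filter order is an assumption; the text does not state it.
    """
    sos = signal.butter(order, [0.8 * f_center, 1.25 * f_center],
                        btype="bandpass", fs=fs, output="sos")
    return signal.hilbert(signal.sosfiltfilt(sos, x))

def coherence(ivc, eeg, fs, f_center):
    """Coherence between two equally sampled signals in one frequency bin."""
    a_v = band_analytic(ivc, fs, f_center)
    a_e = band_analytic(eeg, fs, f_center)
    theta = np.angle(a_v) - np.angle(a_e)    # phase difference
    p_v, p_e = np.abs(a_v), np.abs(a_e)      # "power" as |analytic signal|
    num = np.abs(np.sum(np.exp(1j * theta) * np.sqrt(p_v * p_e)))
    den = np.sqrt(np.sum(p_v * p_e))
    return num / den

# Toy signals: a 1 Hz rhythm shared with a fixed phase lag, versus noise.
fs = 250
t = np.arange(0, 300, 1 / fs)
rng = np.random.default_rng(0)
ivc = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(t.size)
eeg_locked = np.sin(2 * np.pi * t + 0.7) + 0.1 * rng.standard_normal(t.size)
eeg_noise = rng.standard_normal(t.size)
coh_locked = coherence(ivc, eeg_locked, fs, 1.0)
coh_noise = coherence(ivc, eeg_noise, fs, 1.0)
```

In this sketch only the comparison between the empirical value and a shifted-IVC null matters, so the absolute scale of the statistic is not interpreted.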

SI Materials and Methods 


Spectral analysis of speech and sign 


We analyzed temporal structure in multiple spoken and signed languages. These analyses were 
performed using custom software written in Python. 

Spoken language samples comprised audio recordings of stories in 9 languages from 5 language 
families (N=12 samples, total duration 1:07:28; Table S1). Before computing the spectra, these 
recordings were trimmed to a maximum of 6 min (mean 5:37, SD 0:52), mixed down to 1 output 
channel, and downsampled to 22050 Hz. 

Sign language samples were chosen to be as similar as possible to the spoken samples (N=14, 
total duration 1:10:22, mean 5:02, SD 3:19; Table S2), representing 4 languages from 3 families. All 
samples comprised forward-facing views of a single speaker. In 12 videos, the speaker told a story; 
in the remaining 2 videos, the speaker gave an informational speech. The IVC was calculated for 
each video. 

All IVC and broadband envelope recordings were resampled to a common frequency of 30 Hz, 
and power was normalized by dividing out the SD of each signal. Power spectra were computed 
using Welch’s method. Signals were split into 2.13-s long segments ($2^6$ samples) that overlapped 
by 1.07 s ($2^5$ samples). A Hanning window was applied to each segment, and the linear trend was 
removed. Fast Fourier transforms (FFT) were then computed for each segment. The spectrum for 
each signal was obtained by averaging the spectra in all segments. 
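
The spectral estimate described above can be sketched with SciPy's `signal.welch`, which detrends, windows, transforms, and averages overlapping segments. This is a minimal sketch rather than the original analysis code: the function name `normalized_spectrum` is hypothetical, and the default power-spectral-density scaling is an assumption, since the source does not specify the scaling.

```python
import numpy as np
from scipy import signal


def normalized_spectrum(x, fs=30.0):
    """Welch power spectrum of a signal sampled at 30 Hz."""
    x = np.asarray(x, dtype=float)
    x = x / np.std(x)              # normalize power by dividing out the SD
    freqs, power = signal.welch(
        x,
        fs=fs,
        window="hann",             # Hanning window on each segment
        nperseg=64,                # 64 samples at 30 Hz, about 2.13 s
        noverlap=32,               # 32 samples of overlap, about 1.07 s
        detrend="linear",          # remove the linear trend per segment
    )
    return freqs, power
```

With 64-sample segments at 30 Hz, the frequency resolution is 30/64, or roughly 0.47 Hz, which bounds how finely low-frequency peaks can be localized.
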

We fit sign language IVC spectra to a 1/f function using least squares regression by 
transforming the equation $Y = A f^{-\alpha}$ into the linear form $\log Y = \log A - \alpha \log f$, with power $Y$ and 
frequency $f$ [22]. We tested for deviations from the 1/f trend using 1-sample t-tests. 
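
The log-log regression can be illustrated with a short sketch. The function names `fit_power_law` and `log_residuals` are hypothetical, and excluding the zero-frequency bin is an assumption made here because its logarithm is undefined.

```python
import numpy as np


def fit_power_law(freqs, power):
    """Fit Y = A * f**(-alpha) by least squares in log-log space."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    keep = (freqs > 0) & (power > 0)     # log is undefined at f = 0
    slope, intercept = np.polyfit(np.log(freqs[keep]), np.log(power[keep]), 1)
    return np.exp(intercept), -slope     # A, alpha


def log_residuals(freqs, power, amplitude, alpha):
    """Deviation of log power from the fitted 1/f trend (f > 0 only)."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    return np.log(power) - (np.log(amplitude) - alpha * np.log(freqs))
```

One plausible way to test for deviations, not confirmed by the source, is to collect these residuals across samples at each frequency and compare them against zero with a 1-sample t-test.
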


Broadband envelope 


The broadband envelope was computed over samples of oral speech by adapting methods from 
previous studies [22]. First, the raw waveform was band-pass filtered into 25 logarithmically 
spaced frequency bands from 0.1 to 10 kHz using least-squares filters with 501 taps. The narrowband 
envelope of each filtered signal was then calculated as the absolute value of the Hilbert transform. 
The narrowband envelopes were summed to obtain the broadband envelope for each recording. 
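
A minimal sketch of this filter-bank envelope follows. Several details are assumptions not given in the source: the width of the filter transition bands (`transition`), the use of zero-delay convolution, and the function name `broadband_envelope`. The sketch also assumes the sampling rate is high enough that the top band edge, including its transition, lies below the Nyquist frequency, as is the case at 22050 Hz.

```python
import numpy as np
from scipy import signal


def broadband_envelope(waveform, fs=22050, n_bands=25, f_lo=100.0,
                       f_hi=10000.0, numtaps=501, transition=1.1):
    """Sum of narrowband Hilbert envelopes over log-spaced bands."""
    waveform = np.asarray(waveform, dtype=float)
    nyquist = fs / 2.0
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)   # log-spaced band edges
    envelope = np.zeros_like(waveform)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Stopband, passband, stopband; the gaps are transition regions
        # (their width is an assumption of this sketch).
        bands = [0.0, lo / transition, lo, hi,
                 min(hi * transition, nyquist), nyquist]
        desired = [0, 0, 1, 1, 0, 0]
        taps = signal.firls(numtaps, bands, desired, fs=fs)
        narrow = signal.fftconvolve(waveform, taps, mode="same")
        # Narrowband envelope: magnitude of the analytic signal.
        envelope += np.abs(signal.hilbert(narrow))
    return envelope
```

Note that `scipy.signal.hilbert` returns the analytic signal, so its magnitude gives the envelope referred to in the text as the absolute value of the Hilbert transform.
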


Stimuli 


Stimuli comprised two videos of ASL storytelling: “Kondima” and “Little Feet” from the Rosa 
Lee Show [57]. Each video was approximately 10 min long with a 30 Hz frame-rate, and depicted 
a native speaker of ASL telling a story against a static background. To ensure that the timing of 
the video was accurately matched with EEG recordings, a small white square flashed in the corner 
of the display once every 30 frames of video (out of view of the participant), and was registered by 
a photodiode connected to the EEG amplifier. In total, the session lasted 60-90 minutes. 


EEG acquisition and preprocessing 


EEG was recorded at 250 Hz using a 128-channel net (Electrical Geodesics, Eugene, OR). Impedances 
were reduced to < 50 kΩ before participants watched each story. EEG analyses were performed in 
Matlab using custom software and the FieldTrip package [58]. Recordings from electrodes along 
the face, beneath the ears, and at the base of the neck were excluded before any analysis, leaving 
103 channels in all analyses. Electrode movement artifacts were manually identified and rejected by 
replacing the tagged region with zeros and applying a 4000-ms half-Hann taper to each side of the 
artifact. Artifacts from blinks and eye-movements were identified and removed using independent 
component analysis (ICA). To ensure that the IVC of each story was accurately matched to the 
EEG recordings, we used cubic spline interpolation to warp the IVC to the time-stamps registered 
by the photodiode. This process simultaneously resampled the IVC from 30 Hz to 250 Hz. Before 
computing coherence, EEG signals were re-referenced to the average mastoids. 
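
The alignment step can be sketched as follows. The original EEG analyses were run in Matlab; Python is used here only for illustration. The sketch assumes that flash k marks video frame 30k and that frame onsets between consecutive flashes are evenly spaced; neither detail is stated in the source, and the function name `warp_ivc_to_eeg` is hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def warp_ivc_to_eeg(ivc, flash_times, frames_per_flash=30, eeg_fs=250.0):
    """Resample a per-frame IVC onto the EEG sampling grid.

    flash_times: times (s, on the EEG clock) at which the photodiode
    registered a flash, in increasing order.
    """
    ivc = np.asarray(ivc, dtype=float)
    flash_times = np.asarray(flash_times, dtype=float)
    # Assumption: flash k coincides with video frame k * frames_per_flash.
    flash_frames = np.arange(flash_times.size) * frames_per_flash
    n_frames = min(ivc.size, int(flash_frames[-1]) + 1)
    frame_idx = np.arange(n_frames)
    # Assumption: frames between flashes are linearly spaced in time.
    frame_times = np.interp(frame_idx, flash_frames, flash_times)
    spline = CubicSpline(frame_times, ivc[:n_frames])
    eeg_times = np.arange(frame_times[0], frame_times[-1], 1.0 / eeg_fs)
    return eeg_times, spline(eeg_times)
```

Because the spline is evaluated on the 250 Hz grid, the warp and the upsampling from 30 Hz happen in a single step, as described in the text.
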


EEG statistical testing 


Statistical significance of EEG coherence to the IVC was determined by a two-stage randomization 
procedure. To obtain a null distribution of coherence, the onset of the IVC was circularly shifted 
to a randomly selected starting point, and coherence was computed between EEG signals and 
the shifted IVC. This procedure preserves the spectrotemporal characteristics of both signals, 
but eliminates any relationship between them. For each subject, we computed 100 randomly 
shifted baselines. Next, we tested for significant differences between the empirical and randomly 
shifted coherence using a cluster-based non-parametric permutation test [56]. This test looked for a 
difference in coherence between the empirical and randomly shifted data, contrasted with N=10,000 
permutations in which the ‘empirical’ data was randomly selected from the group of empirical and 
randomly-shifted traces. For each frequency and each channel, a T-statistic was computed on the 
difference between empirical and randomly shifted data using a dependent-samples regression. The 
test statistic was computed as the maximum cluster size in each permutation (cluster threshold: 
α = .01, two-tailed). The p-value was calculated as the proportion of permuted cluster statistics 
that were more extreme than the empirical value. 
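
The first stage of this procedure, the randomly shifted baseline, can be sketched as below. The sketch covers only the circular shift; the cluster-based permutation test is not reproduced here. The function name `shifted_null` and the `stat_fn` argument are hypothetical: `stat_fn` stands in for whatever statistic relates the two signals (coherence, in the analysis described above).

```python
import numpy as np


def shifted_null(eeg, ivc, stat_fn, n_shifts=100, seed=None):
    """Null distribution from circularly shifting the IVC onset."""
    rng = np.random.default_rng(seed)
    ivc = np.asarray(ivc)
    null = []
    for _ in range(n_shifts):
        # Pick a new starting sample; 0 is excluded because it would
        # leave the IVC unshifted.
        start = rng.integers(1, ivc.size)
        shifted = np.roll(ivc, -start)   # sample `start` becomes the onset
        null.append(stat_fn(eeg, shifted))
    return np.asarray(null)
```

Circular shifting keeps every sample of the IVC, so its spectrum is unchanged, while the temporal alignment with the EEG is broken.
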

In previous studies, cortical entrainment to speech was often strongest over auditory cortex [2, 8, 
11, 35] and frontal cortex [14]. We tested coherence to sign language in two analogous regions of 
interest (ROIs): at occipital channels, and at frontal channels. 

To test whether knowledge of ASL influenced the strength of entrainment, we performed cluster 
permutation analyses on the difference in coherence between signers and non-signers at the occipital 
and frontal ROIs. For each participant, we normalized the data by computing the Z-score of their 
empirical coherence against the randomly shifted baseline coherence values. Z-scored empirical 
coherence was then compared between groups using cluster permutation tests. 
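
The normalization step amounts to a Z-score of each empirical value against the corresponding randomly shifted values. A minimal sketch follows; the function name `zscore_against_null` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the source does not say which estimator was used.

```python
import numpy as np


def zscore_against_null(empirical, null):
    """Z-score empirical coherence against shifted-baseline coherence.

    empirical: array of shape (...), e.g. (channels, frequencies).
    null: array of shape (n_shifts, ...), one entry per random shift.
    """
    empirical = np.asarray(empirical, dtype=float)
    null = np.asarray(null, dtype=float)
    # ddof=1 (sample SD) is an assumption of this sketch.
    return (empirical - null.mean(axis=0)) / null.std(axis=0, ddof=1)
```

Expressing coherence in units of each participant's own baseline variability puts the two groups on a common scale before they are compared.
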


Data deposition 


All data, stimuli, and analysis scripts are stored in a repository at the University of Chicago, and 
can be obtained by contacting the corresponding author. 
Fig. S1. Comparison of coherence across signing and non-signing participants. Coherence was 
robustly present in both groups of participants, regardless of experience with sign language. The 
top row shows coherence in each frequency bin for fluent signers, and the middle row shows 
coherence for non-signers. The bottom row shows the difference in coherence between the two groups. 
Positive values (shown in red) indicate stronger coherence in signers, and negative values (blue) 
indicate stronger coherence in non-signers. 
Video S1. Example video illustrating how the Instantaneous Visual Change (IVC) tracks visual 
movement during sign language. 


